Abstract: In this paper we invoke a nonanticipative information
Rate Distortion Function (RDF) for sources with memory,
and we analyze its importance in probabilistic matching of the
source to the channel so that transmission of a symbol-by-symbol
code with memory without anticipation is optimal with respect to
an average distortion and excess distortion probability. We show
achievability of the symbol-by-symbol code with memory without
anticipation, and we evaluate the probabilistic performance of the
code for a Markov source.

I. Introduction 

We consider a nonanticipative information Rate Distortion
Function (RDF) for sources with memory, and we investigate
its importance in joint source-channel coding (JSCC), with
emphasis on symbol-by-symbol codes with memory without
anticipation (i.e., the encoder and decoder at each time $i$
process samples independently, with memory on past symbols,
and without anticipation with respect to symbols occurring at
times $j > i$). The aim is to match the source to the channel
probabilistically, and to evaluate the resulting performance with
respect to average distortion and excess distortion probability. For
memoryless sources and channels, necessary and sufficient
conditions for symbol-by-symbol transmission are given in [1]
(see also [2]).

In this paper, we first observe that a necessary condition 
for probabilistic matching of a source with memory to the 
channel so that symbol-by-symbol transmission with memory 
without anticipation is feasible, is the realization of the optimal 
reproduction distribution by a cascade of an encoder-channel- 
decoder processing information causally. Consequently, we 
consider a nonanticipative information RDF which is real- 
izable in the above sense, and we proceed to obtain the 
closed form expression of the reproduction distribution which 
achieves the infimum over the fidelity set. Moreover, we 
prove under certain conditions involving the nonanticipative 
information RDF, and the capacity of certain channels with 
memory and feedback, that symbol-by-symbol code with 
memory without anticipation is achievable. 

Finally we evaluate the performance of a stationary ergodic
Markov source using symbol-by-symbol uncoded transmission
(i.e., the encoder and decoder are unitary operations on their
inputs), with the channel replaced by the optimal reproduction
conditional distribution of the nonanticipative RDF (i.e., the
source is not matched to the channel), by computing an upper
bound on the excess distortion probability using a variation of
Hoeffding's inequality [3]. We also note that the nonanticipative
information RDF is investigated by the authors in the context
of realizable filters in [4], where examples are given for
multidimensional partially observable Gaussian processes.

Fig. 1. Communication scheme with feedback.

II. Symbol-by-Symbol codes with Memory 
Without Anticipation 

In this section we define the elements of a symbol-by- 
symbol code with memory without anticipation. 

Let $\mathbb{N} = \{0, 1, \ldots\}$ and $\mathbb{N}^n = \{0, 1, \ldots, n\}$. The spaces
$\mathcal{X}, \mathcal{A}, \mathcal{B}, \mathcal{Y}$ denote the source output, channel input, channel
output, and decoder output alphabets, respectively, which
are assumed to be complete separable metric spaces (Polish
spaces) to avoid excluding continuous alphabets. We define
their product spaces by $\mathcal{X}_{0,n} = \times_{i=0}^{n} \mathcal{X}$, $\mathcal{A}_{0,n} = \times_{i=0}^{n} \mathcal{A}$,
$\mathcal{B}_{0,n} = \times_{i=0}^{n} \mathcal{B}$, $\mathcal{Y}_{0,n} = \times_{i=0}^{n} \mathcal{Y}$. Let $x^n = \{x_0, x_1, \ldots, x_n\} \in \mathcal{X}_{0,n}$
denote the source sequence of length $n$, and similarly for the
channel input, channel output, and decoder (reproduction) output
sequences, $a^n \in \mathcal{A}_{0,n}$, $b^n \in \mathcal{B}_{0,n}$, $y^n \in \mathcal{Y}_{0,n}$, respectively.
We equip the above product spaces with their measurable
spaces, as usual. Next, we introduce the various distributions
of the blocks appearing in Fig. 1.

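Definition II.1. (Source) The source is a sequence of conditional
distributions $\{P_{X_i|X^{i-1}}(dx_i|x^{i-1}) : \forall i \in \mathbb{N}^n\}$ defined
by

$$P_{X^n}(dx^n) \triangleq \otimes_{i=0}^{n} P_{X_i|X^{i-1}}(dx_i|x^{i-1}).$$

Definition II.2. (Encoder) The encoder is a sequence of conditional
distributions $\{P_{A_i|A^{i-1},B^{i-1},X^i}(da_i|a^{i-1},b^{i-1},x^i) : \forall i \in \mathbb{N}^n\}$ defined by

$$\overrightarrow{P}_{A^n|B^{n-1},X^n}(da^n|b^{n-1},x^n) \triangleq \otimes_{i=0}^{n} P_{A_i|A^{i-1},B^{i-1},X^i}(da_i|a^{i-1},b^{i-1},x^i).$$

Thus, the encoder is nonanticipative in the sense that at
each time $i \in \mathbb{N}^n$, $P_{A_i|A^{i-1},B^{i-1},X^i}(da_i|a^{i-1},b^{i-1},x^i)$ is a
measurable function of past and present symbols $x^i \in \mathcal{X}_{0,i}$
and past symbols $a^{i-1} \in \mathcal{A}_{0,i-1}$, $b^{i-1} \in \mathcal{B}_{0,i-1}$.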

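Definition II.3. (Channel) The channel is a sequence of
conditional distributions $\{P_{B_i|B^{i-1},A^i,X^i}(db_i|b^{i-1},a^i,x^i) : \forall i \in \mathbb{N}^n\}$ defined by

$$\overrightarrow{P}_{B^n|A^n,X^n}(db^n|a^n,x^n) \triangleq \otimes_{i=0}^{n} P_{B_i|B^{i-1},A^i,X^i}(db_i|b^{i-1},a^i,x^i).$$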



Thus the channel has memory, feedback and it is nonantic- 
ipative with respect to the source sequence. 

Definition II.4. (Decoder) The decoder is a sequence of conditional
distributions $\{P_{Y_i|Y^{i-1},B^i}(dy_i|y^{i-1},b^i) : \forall i \in \mathbb{N}^n\}$.

Definitions II.1-II.4 of the source, encoder, channel, and decoder are
general: they have memory and feedback without anticipation,
hence we call the source-channel code a symbol-by-symbol code
with memory without anticipation. Given the source, encoder,
channel, and decoder, we can define uniquely the joint measure by

$$\begin{aligned} P_{X^n,A^n,B^n,Y^n}(dx^n,da^n,db^n,dy^n) = \otimes_{i=0}^{n} \; & P_{Y_i|Y^{i-1},B^i}(dy_i|y^{i-1},b^i) \otimes P_{B_i|B^{i-1},A^i,X^i}(db_i|b^{i-1},a^i,x^i) \\ & \otimes P_{A_i|A^{i-1},B^{i-1},X^i}(da_i|a^{i-1},b^{i-1},x^i) \otimes P_{X_i|X^{i-1}}(dx_i|x^{i-1}). \end{aligned} \tag{1}$$

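Thus, we have indirectly assumed that the following Markov chains
(MCs) hold.

$$(A^{i-1}, B^{i-1}, Y^{i-1}) \leftrightarrow X^{i-1} \leftrightarrow X_i, \quad \forall i \in \mathbb{N}^n \tag{2}$$

$$Y^{i-1} \leftrightarrow (A^{i-1}, B^{i-1}, X^i) \leftrightarrow A_i, \quad \forall i \in \mathbb{N}^n \tag{3}$$

$$Y^{i-1} \leftrightarrow (A^i, B^{i-1}, X^i) \leftrightarrow B_i, \quad \forall i \in \mathbb{N}^n \tag{4}$$

$$(A^i, X^i) \leftrightarrow (B^i, Y^{i-1}) \leftrightarrow Y_i, \quad \forall i \in \mathbb{N}^n. \tag{5}$$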
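The factorization (1) can be read as an ancestral sampling recipe: at each time the source, encoder, channel, and decoder act in that order, each seeing only the histories allowed by Definitions II.1-II.4. The following is a minimal sketch under that reading, for finite alphabets; the function `simulate_cascade` and the toy kernels are hypothetical names introduced here for illustration and do not come from the paper.

```python
import random

def simulate_cascade(n, source, encoder, channel, decoder, seed=0):
    """Sample (x^n, a^n, b^n, y^n) from the joint measure (1), one time step at a time.

    Each kernel is a callable that receives a random.Random and the histories
    it is allowed to use:
      source(rng, x_past)              -> x_i   (x_past = x^{i-1})
      encoder(rng, a_past, b_past, x)  -> a_i   (x = x^i, includes x_i)
      channel(rng, b_past, a, x)       -> b_i   (a = a^i, x = x^i)
      decoder(rng, y_past, b)          -> y_i   (b = b^i)
    """
    rng = random.Random(seed)
    x, a, b, y = [], [], [], []
    for _ in range(n + 1):
        x.append(source(rng, tuple(x)))
        a.append(encoder(rng, tuple(a), tuple(b), tuple(x)))
        b.append(channel(rng, tuple(b), tuple(a), tuple(x)))
        y.append(decoder(rng, tuple(y), tuple(b)))
    return x, a, b, y

# Toy kernels (assumptions for illustration only).
def markov_source(rng, x_past, p=0.3):
    if not x_past:
        return rng.randint(0, 1)
    return x_past[-1] ^ int(rng.random() < p)   # flip with probability p

def identity_encoder(rng, a_past, b_past, x):
    return x[-1]                                 # uncoded: a_i = x_i

def noiseless_channel(rng, b_past, a, x):
    return a[-1]                                 # b_i = a_i

def identity_decoder(rng, y_past, b):
    return b[-1]                                 # y_i = b_i

x, a, b, y = simulate_cascade(9, markov_source, identity_encoder,
                              noiseless_channel, identity_decoder)
print(x)
print(y)
```

Because no kernel ever receives a symbol with index larger than the current time, any choice of kernels plugged into this loop is nonanticipative by construction.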



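The distortion function between the source and its reproduction
is a measurable function $d_{0,n} : \mathcal{X}_{0,n} \times \mathcal{Y}_{0,n} \to [0, \infty)$,

$$d_{0,n}(x^n, y^n) \triangleq \sum_{i=0}^{n} \rho_{0,i}(T^i x^n, T^i y^n),$$

where $(T^i x^n, T^i y^n)$ are the shift operations on $(x^n, y^n)$,
respectively. For a single letter distortion function we take
$\rho_{0,i}(T^i x^n, T^i y^n) = \rho(x_i, y_i)$. The cost of transmitting symbols
over the channel is a measurable function $c_{0,n}$, likewise
defined as a sum over $i = 0, \ldots, n$ of per-symbol costs.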

Next, we state the definition of a symbol-by-symbol code with 
memory without anticipation. 

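Definition II.5. (Symbol-by-Symbol Code with
Memory without Anticipation). An $(n, d, \epsilon, P)$ symbol-by-symbol
code with memory without anticipation for
$(\mathcal{X}_{0,n}, \mathcal{A}_{0,n}, \mathcal{B}_{0,n}, \mathcal{Y}_{0,n}, P_{X^n}, \overrightarrow{P}_{B^n|A^n,X^n}, d_{0,n}, c_{0,n})$
is a code $\{P_{A_i|A^{i-1},B^{i-1},X^i}(\cdot|\cdot) : \forall i \in \mathbb{N}^n\}$,
$\{P_{Y_i|Y^{i-1},B^i}(\cdot|\cdot) : \forall i \in \mathbb{N}^n\}$ with excess distortion
probability

$$\mathbb{P}\{d_{0,n}(x^n, y^n) > (n+1)d\} \leq \epsilon, \quad \epsilon \in (0, 1), \; d \geq 0,$$

and transmission cost $\frac{1}{n+1} E\{c_{0,n}(A^n, Y^{n-1})\} \leq P$, $P \geq 0$.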
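The excess distortion probability in Definition II.5 can be estimated by Monte Carlo once a way of sampling pairs $(x^n, y^n)$ is available. The sketch below assumes a single-letter distortion; `excess_distortion_probability` and the toy pair generator are hypothetical helpers introduced for illustration, and the numerical values are arbitrary.

```python
import random

def total_distortion(x, y, rho):
    """d_{0,n}(x^n, y^n) for a single-letter distortion rho(x_i, y_i)."""
    return sum(rho(xi, yi) for xi, yi in zip(x, y))

def excess_distortion_probability(sample_pair, rho, n, d, trials=2000, seed=0):
    """Monte Carlo estimate of P{ d_{0,n}(x^n, y^n) > (n + 1) d }.

    sample_pair(rng, n) must return two sequences (x^n, y^n) of length n + 1.
    """
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        x, y = sample_pair(rng, n)
        if total_distortion(x, y, rho) > (n + 1) * d:
            exceed += 1
    return exceed / trials

def hamming(u, v):
    return int(u != v)

# Toy pair (an assumption for illustration): fair i.i.d. bits, each reproduced
# incorrectly with probability 0.1, independently of everything else.
def toy_pair(rng, n):
    x = [rng.randint(0, 1) for _ in range(n + 1)]
    y = [xi ^ int(rng.random() < 0.1) for xi in x]
    return x, y

estimate = excess_distortion_probability(toy_pair, hamming, n=199, d=0.2)
print("estimated excess distortion probability:", estimate)
```

An $(n, d, \epsilon, P)$ code in the sense of the definition requires this probability to be at most $\epsilon$, in addition to the cost constraint, which the sketch does not model.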

Definition II.6. (Minimum Excess Distortion) The minimum
excess distortion achievable by a symbol-by-symbol code with
memory without anticipation $(n, d, \epsilon, P)$ is defined by

$$D^{o}(n, \epsilon, P) = \inf\{d : \exists\, (n, d, \epsilon, P) \text{ symbol-by-symbol code with memory without anticipation}\}.$$

Our definition of a symbol-by-symbol code with memory
without anticipation is randomized, hence it embeds deterministic
codes as a special case [2].

III. Nonanticipative Versus Classical RDF 

In this section, we first establish the claim that the clas- 
sical RDF for sources with memory, is not the appropriate 
measure for lossy compression in symbol-by-symbol codes 
with memory without anticipation. Recall that the necessary 
conditions for transmission of symbol-by-symbol codes with 
memory without anticipation (this is also true for memoryless 
sources and channels) are the following. 

1) Realization of the optimal reproduction distribution of 
lossy compression with fidelity by an encoder-channel- 
decoder scheme, processing information causally; 

2) Computation of the RDF and that of the optimal repro- 
duction distribution so that probabilistic matching of the 
source and channel is feasible. 

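Consider the average fidelity set

$$Q_{0,n}(D) \triangleq \Big\{ P_{Y^n|X^n} : \frac{1}{n+1} \int_{\mathcal{X}_{0,n} \times \mathcal{Y}_{0,n}} d_{0,n}(x^n, y^n) P_{Y^n|X^n}(dy^n|x^n) \otimes P_{X^n}(dx^n) \leq D \Big\}.$$

Here, $P_{X^n}(\cdot)$ is the source distribution and $P_{Y^n|X^n}(\cdot|x^n)$
is the reproduction distribution, and it is known that for a
stationary ergodic source and single letter distortion, the OPTA
(optimal performance theoretically attainable) is given by the RDF [5]

$$R(D) = \lim_{n \to \infty} R_{0,n}(D) \tag{6}$$

$$R_{0,n}(D) \triangleq \inf_{P_{Y^n|X^n} \in Q_{0,n}(D)} \frac{1}{n+1} I(X^n; Y^n) \tag{7}$$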

provided the infimum is achievable. It is also well known that
if the infimum in (7) exists, then [5]

$$P^*_{Y^n|X^n}(dy^n|x^n) = \frac{e^{s d_{0,n}(x^n, y^n)} P^*_{Y^n}(dy^n)}{\int_{\mathcal{Y}_{0,n}} e^{s d_{0,n}(x^n, y^n)} P^*_{Y^n}(dy^n)} \tag{8}$$

where $s \in (-\infty, 0]$ is the Lagrange multiplier associated with
the fidelity set $Q_{0,n}(D)$. Clearly, by Bayes' rule

$$P^*_{Y^n|X^n}(dy^n|x^n) = \otimes_{i=0}^{n} P^*_{Y_i|X^n,Y^{i-1}}(dy_i|x^n, y^{i-1}) \tag{9}$$

and hence the optimal reproduction $y_i$ at time $i$ of $x_i$ depends
on the past reproductions and past and present source symbols
$\{y^{i-1}, x^i\}$, and on the future source symbols $\{x_{i+1}, \ldots, x_n\}$, $n > i$.
Thus, in general the optimal reproduction distribution is
anticipative with respect to the source symbols, and hence
it is not realizable in the sense described earlier. Moreover,
for sources with memory it is very difficult to compute the
value of $R(D)$. Even for the Binary Symmetric Markov Source
(BSMS) the exact expression of $R(D)$ is not known [6]. The
independent source and the Gaussian source are exceptions.
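As a small illustration of why the factorization (9) is anticipative (a worked special case written for this discussion, not taken from the paper), take $n = 1$, so that the source sequence is $(x_0, x_1)$:

```latex
P^*_{Y^1|X^1}(dy^1|x^1)
  = P^*_{Y_0|X^1}(dy_0|x_0, x_1) \otimes P^*_{Y_1|X^1,Y_0}(dy_1|x_0, x_1, y_0)
```

The first factor generates $y_0$ using $x_1$, a symbol that is not yet available at time $0$. A causal scheme would instead require the first factor to have the form $P_{Y_0|X_0}(dy_0|x_0)$.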

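Now, we introduce the nonanticipative information RDF,
which by construction is realizable, and in Section IV we
compute its closed form expression. Given a source $P_{X^n}(dx^n)$
and a causal conditional distribution defined by

$$\overrightarrow{P}_{Y^n|X^n}(dy^n|x^n) \triangleq \otimes_{i=0}^{n} P_{Y_i|Y^{i-1},X^i}(dy_i|y^{i-1}, x^i) \tag{10}$$

the joint distribution $P_{Y^n,X^n}$ and the marginal distribution
$P_{Y^n}$ are uniquely defined. Introduce the information measure
($\mathbb{D}(\cdot\|\cdot)$ denotes the relative entropy)

$$I(X^n \to Y^n) \triangleq \mathbb{D}(\overrightarrow{P}_{Y^n|X^n} \otimes P_{X^n} \,\|\, P_{Y^n} \times P_{X^n}) = I_{X^n \to Y^n}(P_{X^n}, \overrightarrow{P}_{Y^n|X^n}).$$

Consider the fidelity set defined by

$$\overrightarrow{Q}_{0,n}(D) = \Big\{ \overrightarrow{P}_{Y^n|X^n} : \frac{1}{n+1} \int_{\mathcal{X}_{0,n} \times \mathcal{Y}_{0,n}} d_{0,n}(x^n, y^n) \overrightarrow{P}_{Y^n|X^n}(dy^n|x^n) \otimes P_{X^n}(dx^n) \leq D \Big\}. \tag{11}$$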

Next, we introduce the nonanticipative information RDF. 

Definition III.1. (Nonanticipative Information RDF) Given
$\overrightarrow{Q}_{0,n}(D)$, the nonanticipative information RDF is defined by

$$R^{na}_{0,n}(D) \triangleq \inf_{\overrightarrow{P}_{Y^n|X^n} \in \overrightarrow{Q}_{0,n}(D)} \frac{1}{n+1} I_{X^n \to Y^n}(P_{X^n}, \overrightarrow{P}_{Y^n|X^n}) \tag{12}$$

and its rate by $R^{na}(D) = \lim_{n \to \infty} R^{na}_{0,n}(D)$, provided the infimum
and the limit exist.

Clearly, if the minimum of $R^{na}_{0,n}(D)$ exists, the optimal
reproduction distribution is nonanticipative, and hence realizable
in the sense described before.

Next, we draw the connection between $R^{na}_{0,n}(D)$, $R_{0,n}(D)$, and the
Gorbunov-Pinsker definition of nonanticipatory $\varepsilon$-entropy,
by first introducing the following equivalent statements of
conditional independence.

Lemma III.2. (Equivalent Statements of Nonanticipation) The
following are equivalent for $i = 0, 1, \ldots, n-1$, $\forall n \in \mathbb{N}$.

1) $X^n_{i+1} \leftrightarrow (X^i, Y^{i-1}) \leftrightarrow Y_i$ forms a MC;

2) $X_{i+1} \leftrightarrow X^i \leftrightarrow Y^i$ forms a MC;

3) $X^n_{i+1} \leftrightarrow X^i \leftrightarrow Y^i$ forms a MC;

4) $P_{Y^n|X^n}(dy^n|x^n) = \overrightarrow{P}_{Y^n|X^n}(dy^n|x^n)$.

Proof: The equivalence of 1), 2), 4) is easy. If 3) holds
then $P_{X^n_{i+1}|X^i,Y^i}(dx^n_{i+1}|x^i, y^i) = P_{X^n_{i+1}|X^i}(dx^n_{i+1}|x^i)$ and
hence 2) is obtained by integration. By induction one can show
that 2) implies 3). ■

Clearly, $R^{na}_{0,n}(D) \geq R_{0,n}(D)$. Next, we discuss the relation
between $R^{na}_{0,n}(D)$ and the Gorbunov and Pinsker [7] nonanticipatory
$\varepsilon$-entropy. Gorbunov and Pinsker [7] restricted the
fidelity set $Q_{0,n}(D)$ to those reproduction distributions which
satisfy the MC of Lemma III.2, 3), and introduced the nonanticipatory
$\varepsilon$-entropy defined by [7]

$$R^{\varepsilon}_{0,n}(D) \triangleq \inf_{\substack{P_{Y^n|X^n} \in Q_{0,n}(D): \\ X^n_{i+1} \leftrightarrow X^i \leftrightarrow Y^i, \; i = 0, 1, \ldots, n-1}} \frac{1}{n+1} I(X^n; Y^n) \tag{13}$$

and the nonanticipatory message generation rate of the source by
$R^{\varepsilon}(D) = \lim_{n \to \infty} R^{\varepsilon}_{0,n}(D)$, provided the infimum exists and
the limit is finite. The MC in (13) means that the reproduction
distribution which minimizes (13) can be realized via
an encoder-channel-decoder, using nonanticipative (causal)
operations.

In view of Lemma III.2 we have the following theorem.

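Theorem III.3. (Equivalent Nonanticipative RDF) The following
holds:

$$R^{\varepsilon}_{0,n}(D) = R^{na}_{0,n}(D), \quad \forall n \in \mathbb{N}. \tag{14}$$

Proof: If any of the statements of Lemma III.2 hold then
$I(X^n; Y^n) = I_{X^n \to Y^n}(P_{X^n}, \overrightarrow{P}_{Y^n|X^n})$, and the fidelity set is (11). ■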
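The identity used in the proof can be unpacked with the chain rule of mutual information. This is a standard step, written here in the notation of this section; the conditional mutual information symbols $I(\cdot\,;\cdot\,|\,\cdot)$ are introduced only for this illustration and assume all quantities are finite.

```latex
\begin{aligned}
I(X^n; Y^n) &= \sum_{i=0}^{n} I(X^n; Y_i \mid Y^{i-1}) \\
            &= \sum_{i=0}^{n} I(X^i; Y_i \mid Y^{i-1})
             + \sum_{i=0}^{n-1} I(X^n_{i+1}; Y_i \mid Y^{i-1}, X^i).
\end{aligned}
```

Under statement 1) of Lemma III.2 every term of the second sum vanishes, so $I(X^n; Y^n)$ reduces to $\sum_{i=0}^{n} I(X^i; Y_i | Y^{i-1})$, which is the relative entropy form of the information measure evaluated at a causal reproduction distribution.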

IV. Solution Nonanticipative RDF 
In this section we give the expression of the nonanticipative
reproduction distribution which achieves the infimum in (12).
First, we note that in view of Theorem III.3 the results derived
in [7] are applicable to $R^{na}_{0,n}(D)$ and $R^{na}(D)$, and these results
include sufficient conditions for stationary sources to give an
optimal reproduction distribution corresponding to a stationary
source-reproduction pair $\{(X_i, Y_i) : i = 0, 1, \ldots\}$.

Thus, under the conditions in [7], or assuming the solution
of $R^{na}_{0,n}(D)$ gives an optimal nonanticipative reproduction
distribution which is stationary, and hence $\overrightarrow{P}^*_{Y^n|X^n}(dy^n|x^n)$
is an $(n+1)$-fold convolution of stationary conditional
distributions, we have the following theorem.

Theorem IV.1. Suppose there exists an interior point of the
fidelity set, and the optimal reproduction is stationary. Then
the infimum over $\overrightarrow{Q}_{0,n}(D)$ in (12) is attained by

$$\overrightarrow{P}^*_{Y^n|X^n}(dy^n|x^n) = \otimes_{i=0}^{n} \frac{e^{s\rho_{0,i}(T^i x^n, T^i y^n)} P^*_{Y_i|Y^{i-1}}(dy_i|y^{i-1})}{\int_{\mathcal{Y}} e^{s\rho_{0,i}(T^i x^n, T^i y^n)} P^*_{Y_i|Y^{i-1}}(dy_i|y^{i-1})} \tag{15}$$

where $s \leq 0$ is the Lagrange multiplier associated with the
constraint, which is satisfied with equality, and





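$$R^{na}_{0,n}(D) = sD - \frac{1}{n+1} \sum_{i=0}^{n} \int \log\Big( \int_{\mathcal{Y}} e^{s\rho_{0,i}(T^i x^n, T^i y^n)} P^*_{Y_i|Y^{i-1}}(dy_i|y^{i-1}) \Big) P_{X_i|X^{i-1}}(dx_i|x^{i-1}) \otimes P^*_{X^{i-1},Y^{i-1}}(dx^{i-1}, dy^{i-1}). \tag{16}$$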



Proof: The derivation is given in BJ. ■ 

The point to be made regarding the optimal reproduction
distribution is that it is nonanticipative and, as we show in
Section VI, easy to compute, even for sources with memory.
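For finite alphabets and a single-letter distortion, each factor of (15) is an exponentially tilted version of the reproduction prior $P^*_{Y_i|Y^{i-1}}$, normalized over $y_i$. The following is a minimal sketch of one such factor for a fixed past $y^{i-1}$; the function name `tilted_kernel` and the numerical values are assumptions chosen for illustration, and the prior is taken as given rather than solved for.

```python
import math

def tilted_kernel(rho, prior, s):
    """One factor of (15) for finite alphabets and a fixed past y^{i-1}.

    rho:   rho[x][y], single-letter distortion between x_i = x and y_i = y
    prior: prior[y] = P*(y_i = y | y^{i-1})
    s:     Lagrange multiplier, s <= 0
    Returns rows[x][y] = P*(y_i = y | y^{i-1}, x_i = x).
    """
    rows = []
    for rho_x in rho:
        weights = [math.exp(s * r) * q for r, q in zip(rho_x, prior)]  # numerator
        total = sum(weights)                                           # normalizer
        rows.append([w / total for w in weights])
    return rows

# Illustrative numbers (assumptions): binary alphabets, Hamming distortion.
hamming = [[0.0, 1.0], [1.0, 0.0]]
prior = [0.61, 0.39]
s = math.log(0.1 / 0.9)
kernel = tilted_kernel(hamming, prior, s)
for row in kernel:
    print(row)
```

With $s = 0$ the distortion is ignored and every row equals the prior; as $s$ decreases, mass concentrates on reproductions with small distortion.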

V. Coding Theorem 

In this section we show achievability of a symbol-by-symbol
code with memory without anticipation. We also note that, in
view of the equivalence $R^{\varepsilon}_{0,n}(D) = R^{na}_{0,n}(D)$, $R^{na}_{0,n}(D)$
is the OPTA by sequential codes (see [8]).

The probabilistic realization of the optimal reproduction 
distribution by an encoder-channel-decoder, is necessary for 
probabilistic matching of the source and the channel. Next, 
we give the precise definition of the realization. 

Definition V.1. (Realization) Given a source $\{P_{X_i|X^{i-1}}(dx_i|x^{i-1}) : \forall i \in \mathbb{N}^n\}$,
a general channel $\{P_{B_i|B^{i-1},A^i,X^i}(db_i|b^{i-1}, a^i, x^i) : \forall i \in \mathbb{N}^n\}$
is a realization of the optimal reproduction distribution
$\{P^*_{Y_i|Y^{i-1},X^i}(dy_i|y^{i-1}, x^i) : \forall i \in \mathbb{N}^n\}$ of Theorem IV.1
if there exists a pre-channel encoder
$\{P_{A_i|A^{i-1},B^{i-1},X^i}(da_i|a^{i-1}, b^{i-1}, x^i) : \forall i \in \mathbb{N}^n\}$ and a
post-channel decoder $\{P_{Y_i|Y^{i-1},B^i}(dy_i|y^{i-1}, b^i) : \forall i \in \mathbb{N}^n\}$ such
that

$$\overrightarrow{P}^*_{Y^n|X^n}(dy^n|x^n) = \otimes_{i=0}^{n} P^*_{Y_i|Y^{i-1},X^i}(dy_i|y^{i-1}, x^i) = \otimes_{i=0}^{n} P_{Y_i|Y^{i-1},X^i}(dy_i|y^{i-1}, x^i) \tag{17}$$

where the joint distribution from which (17) is obtained is
precisely (1). Moreover, we say that $R^{na}_{0,n}(D)$ is realizable if,
in addition, the realization operates with average distortion $D$
and $\frac{1}{n+1} I_{X^n \to Y^n}(P_{X^n}, \overrightarrow{P}^*_{Y^n|X^n}) = R^{na}_{0,n}(D)$.

If the optimal reproduction distribution is realizable (see
Definition V.1), then the data processing inequality holds:

$$I_{X^n \to Y^n}(P_{X^n}, \overrightarrow{P}^*_{Y^n|X^n}) \leq I(X^n \to B^n), \quad \forall n \in \mathbb{N}. \tag{18}$$

If $R^{na}_{0,n}(D)$ is realizable according to Definition V.1, then
the source is not necessarily matched to the channel. Next,
we prove (under certain conditions) achievability, by first
introducing the information definition of channel capacity.
Consider the following average cost set defined by

$$\mathcal{P}_{0,n}(P) \triangleq \Big\{ (X^n, A^n) : \frac{1}{n+1} E\{c_{0,n}(A^n, Y^{n-1})\} \leq P \Big\}.$$

Since we consider the general scenario in which (2)-(5) hold,
we define the information channel capacity from the source to
the channel output as follows [9]:

$$C_{0,n}(P) \triangleq \sup_{(X^n, A^n) \in \mathcal{P}_{0,n}(P)} \frac{1}{n+1} I(X^n \to B^n)$$

and its rate (provided the supremum is finite and the limit exists) by
$C(P) = \lim_{n \to \infty} C_{0,n}(P)$.

Next, we prove achievability of a symbol-by-symbol code. 

Theorem V.2. (Achievability of Symbol-by-Symbol Code with
Memory Without Anticipation).
Suppose the following conditions hold.

1) $R^{na}_{0,n}(D)$ has a solution and the optimal reproduction
distribution is stationary.

2) $C_{0,n}(P)$ has a solution and the maximizing processes
are stationary.

3) The optimal reproduction distribution $\overrightarrow{P}^*_{Y^n|X^n}(dy^n|x^n)$
given by Theorem IV.1 is realizable, and $R^{na}_{0,n}(D)$ is also
realizable.

4) There exist $D$ and $P$ such that $R^{na}_{0,n}(D) = C_{0,n}(P)$.

If

$$\mathbb{P}\Big\{ \sum_{i=0}^{n} \rho_{0,i}(T^i x^n, T^i y^n) > (n+1)d \Big\} \leq \epsilon \tag{19}$$

where $\mathbb{P}\{\cdot\}$ is taken with respect to $P_{Y^n,X^n}(dy^n, dx^n) =
\overrightarrow{P}^*_{Y^n|X^n}(dy^n|x^n) \otimes P_{X^n}(dx^n)$, then there exists an $(n, d, \epsilon, P)$
symbol-by-symbol code with memory without anticipation.

Proof: The derivation is similar to [1]. If conditions
1), 3) hold then the optimal reproduction distribution is
realizable, and this realization achieves $R^{na}_{0,n}(D)$. By 4) the
source is matched to the channel so that the excess distortion
probability of a symbol-by-symbol code with memory without
anticipation satisfies (19). ■

1) Symbol-by-Symbol Code: It can be shown that if the
source is Markov, and the channel is Markov with respect to
the source, satisfying

1) $P_{X_i|X^{i-1}}(dx_i|x^{i-1}) = P_{X_i|X_{i-1}}(dx_i|x_{i-1})$, $\forall i \in \mathbb{N}^n$

2) $P_{B_i|B^{i-1},A^i,X^i}(db_i|b^{i-1}, a^i, x^i) = P_{B_i|B^{i-1},A_i,X_i}(db_i|b^{i-1}, a_i, x_i)$, $\forall i \in \mathbb{N}^n$,

then maximizing the directed information $I(X^n \to B^n)$ over
non-Markov encoders $\{P_{A_i|A^{i-1},B^{i-1},X^i} : i = 0, 1, \ldots, n\}$
is equivalent to maximizing it over encoders $\{P_{A_i|B^{i-1},X_i} :
i = 0, 1, \ldots, n\}$, and similarly, maximizing $I(X^n \to B^n)$
over non-Markov deterministic encoders $\{e_i(x^i, a^{i-1}, y^{i-1}) :
i = 1, \ldots, n\}$ is equivalent to the maximization with respect
to encoders $\{g_i(x_i, y^{i-1}) : i = 1, \ldots, n\}$. This result appeared
in [10]. Thus, under these two conditions the encoder is
symbol-by-symbol Markov with respect to the source, and
nothing can be gained by considering an encoder that depends
on the entire past of the source causally.

VI. Application 

In this section we consider the Binary Symmetric Markov
Source, for which the classical RDF is unsolved and only
bounds are known. We show that the solution of the
nonanticipative information RDF can be obtained relatively
easily. Subsequently, we evaluate the performance of uncoded
transmission. It is shown that even this uncoded, unmatched
scheme, although suboptimal, ensures that the excess distortion
probability goes to zero.

Consider a Binary Symmetric Markov Source (BSMS($p$)), with
$P(x_i = 0 | x_{i-1} = 0) = P(x_i = 1 | x_{i-1} = 1) = 1 - p$
and $P(x_i = 1 | x_{i-1} = 0) = P(x_i = 0 | x_{i-1} = 1) = p$,
$i = 0, 1, \ldots, n$. We apply a single letter Hamming distortion
criterion, $\rho(x, y) = 0$ if $x = y$ and $\rho(x, y) = 1$ if $x \neq y$. The
objective is to compute $R^{na}(D)$.
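The source and distortion just defined are easy to simulate. The sketch below draws a BSMS($p$) path and checks two of its basic properties empirically; starting the chain from the uniform distribution, the value $p = 0.39$, and the helper names are assumptions made for illustration.

```python
import random

def sample_bsms(p, n, seed=0):
    """Draw x_0, ..., x_n from a BSMS(p), with x_0 uniform on {0, 1}."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1)]
    for _ in range(n):
        flip = rng.random() < p            # the symbol changes with probability p
        x.append(x[-1] ^ int(flip))
    return x

def hamming(x, y):
    """Single-letter Hamming distortion rho(x, y)."""
    return 0 if x == y else 1

x = sample_bsms(p=0.39, n=100_000, seed=1)
flips = sum(hamming(u, v) for u, v in zip(x[:-1], x[1:]))
print("empirical flip frequency:", flips / (len(x) - 1))
print("empirical P(X_i = 1):", sum(x) / len(x))
```

The flip frequency should be close to $p$ and the fraction of ones close to $0.5$, the steady state distribution of the source.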
Fig. 2. The distortion between the source and reproduction symbols for
a random realization of the source, as a function of $n$, using the optimal
reproduction distribution as the channel and uncoded transmission.

Fig. 3. Excess Probability of Distortion for $\delta = 0.01$ (plot label: $p = 0.39$).

Proposition VI.1. For a BSMS($p$) and single letter Hamming distortion
criterion we have

$$R^{na}(D) = \begin{cases} H(m) - H(D) & \text{if } D \leq \frac{1}{2} \\ 0 & \text{otherwise} \end{cases}$$

where $m = 1 - p - D + 2pD$.

Proof: We describe the main steps. The steady state
distribution of the source is $P(X_i = 0) = P(X_i = 1) = 0.5$,
the reproduction distribution is $P^*_{Y_i|X^i,Y^{i-1}} = P^*_{Y_i|X_i,Y^{i-1}}$,
and we can show that $P^*_{Y_i|X_i,Y^{i-1}} = P^*_{Y_i|X_i,Y_{i-1}}$, where

$$P^*_{Y_i|X_i,Y_{i-1}}(y_i|x_i, y_{i-1}) = \begin{array}{c|cccc} (x_i, y_{i-1}) & 0,0 & 0,1 & 1,0 & 1,1 \\ \hline y_i = 0 & \alpha & \beta & 1-\beta & 1-\alpha \\ y_i = 1 & 1-\alpha & 1-\beta & \beta & \alpha \end{array}$$

where $\alpha = \frac{(1-p)(1-D)}{1-p-D+2pD}$ and $\beta = \frac{p(1-D)}{p+D-2pD}$. ■

Next, we discuss symbol-by-symbol uncoded transmission
over a channel characterized via the optimal reproduction
distribution. This approach is suboptimal since the channel's
capacity is not necessarily matched to the source RDF. The
matching is part of ongoing research and it could be possible
by adding a cost constraint on the channel. A realization of
the described scheme is shown in Fig. 2, where it is verified
that as the number of channel uses $n$ is increased, the single
letter distortion between the source symbol sequence and the
reproduction sequence converges to the average distortion $D$.

Next, we bound the excess distortion probability of Theorem
V.2 by applying an extension of Hoeffding's inequality for
MCs [3], which bounds the probability of a function of a
Markov source. It can be shown that $\{Z_i = (Y_i, X_i) : \forall i \in \mathbb{N}\}$
is Markov. Set $\rho(x, y) = x \oplus y$ and let $S_n = \sum_{i=0}^{n} \rho(X_i, Y_i)$.
Let $d = \delta + \frac{E\{S_n\}}{n+1}$, $\delta > 0$. By Hoeffding's inequality, the excess
distortion probability is bounded by

$$\mathbb{P}\{S_n > (n+1)d\} \leq \exp\Big( -\frac{\lambda^2 \big((n+1)\delta - 2\|f\| m / \lambda\big)^2}{2(n+1)\|f\|^2 m^2} \Big)$$

where $\|f\| = 1$, $m = 1$, $\lambda = \min\{p, 1-p\} \min\{\alpha, \beta, 1-\alpha, 1-\beta\}$,
for $n > 2\|f\| m / (\lambda\delta)$ (here $m$ is the constant of the inequality,
not the quantity $m$ of Proposition VI.1). This bound is illustrated
in Fig. 3. Although this bound is not tight and holds only for $n$
large enough, it shows the achievability of Markov sources
via uncoded transmission. It might be possible to compute
the excess distortion probability in closed form to get tighter
bounds.
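The quantities of this section can be evaluated numerically. The sketch below computes $R^{na}(D)$ from Proposition VI.1, the kernel parameters $\alpha$ and $\beta$, simulates uncoded transmission through the kernel, and evaluates the right-hand side of the Hoeffding-type bound. It is an illustration under stated assumptions: entropies are taken in bits, $D = 0.1$ is an arbitrary choice (only $p = 0.39$ and $\delta = 0.01$ appear in the figures), the simulation starts from an arbitrary initial pair, and all function names are hypothetical.

```python
import math
import random

def binary_entropy(q):
    """H(q) in bits, with H(0) = H(1) = 0."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def rna_bsms(p, D):
    """R^{na}(D) of Proposition VI.1 for a BSMS(p) with Hamming distortion."""
    if D > 0.5:
        return 0.0
    m = 1 - p - D + 2 * p * D
    return binary_entropy(m) - binary_entropy(D)

def kernel_params(p, D):
    """alpha and beta of the optimal reproduction kernel."""
    m = 1 - p - D + 2 * p * D
    return (1 - p) * (1 - D) / m, p * (1 - D) / (1 - m)

def simulate_uncoded(p, D, n, seed=0):
    """Average Hamming distortion when the kernel itself acts as the channel."""
    rng = random.Random(seed)
    alpha, beta = kernel_params(p, D)
    # P(y_i = 0 | x_i, y_{i-1}), indexed by (x_i, y_{i-1})
    prob_y0 = {(0, 0): alpha, (0, 1): beta, (1, 0): 1 - beta, (1, 1): 1 - alpha}
    x, y_prev, errors = rng.randint(0, 1), rng.randint(0, 1), 0
    for _ in range(n + 1):
        y = 0 if rng.random() < prob_y0[(x, y_prev)] else 1
        errors += x ^ y
        y_prev = y
        x ^= int(rng.random() < p)        # next source symbol
    return errors / (n + 1)

def hoeffding_bound(p, D, delta, n, f_norm=1.0, m_const=1.0):
    """Right-hand side of the bound; None when n is below the stated threshold."""
    alpha, beta = kernel_params(p, D)
    lam = min(p, 1 - p) * min(alpha, beta, 1 - alpha, 1 - beta)
    if n <= 2 * f_norm * m_const / (lam * delta):
        return None
    num = lam ** 2 * ((n + 1) * delta - 2 * f_norm * m_const / lam) ** 2
    return math.exp(-num / (2 * (n + 1) * f_norm ** 2 * m_const ** 2))

p, D, delta = 0.39, 0.1, 0.01
print("R^na(D) in bits:", rna_bsms(p, D))
print("alpha, beta:", kernel_params(p, D))
print("simulated average distortion:", simulate_uncoded(p, D, n=200_000, seed=1))
for n in (10**6, 10**7, 10**8):
    print(n, hoeffding_bound(p, D, delta, n))
```

The simulated average distortion should settle near $D$, in line with the behavior described for Fig. 2, while the bound only becomes informative for very large $n$, consistent with the remark that it is not tight.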

VII. Conclusions 

This paper considers nonanticipative information RDF and 
discusses its application to General Source-Channel Matching, 
generalizing earlier results on uncoded transmission to random 
processes with memory and nonanticipative feedback. 